618 results found.
Written
Morphology
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
Size:
189 MByte
Production Status:
Existing-used
Use:
Morphological Analysis
- Paper title: A Corpus of German Reddit Exchanges (GeRedE)
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Andreas Blombach | Stuttgart MORPhology (SMOR) | /N |
Documentation:
https://www.ims.uni-stuttgart.de/documents/ressourcen/werkzeuge/smor/ (in German)
Written
Corpus
Language Type:
Bilingual
Languages:
English German
Availability:
Freely Available
License:
CreativeCommons
Size:
5 GByte
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: CEASR: A Corpus for Evaluating Automatic Speech Recognition
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Malgorzata Anna Ulasik | CEASR | /N |
Documentation:
The documentation is currently being written and will be published on the resource website in due time. It is in English.
Written
Corpus
Language Type:
Multilingual
Languages:
French German Luxembourgish
Availability:
Freely Available
License:
Multiple
Size:
10,000,000 tokens
Production Status:
Newly created-finished
Use:
multiple uses
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso Historical Newspaper Textual Material | /N |
Documentation:
Yes, in English
Written
OCR quality assessment of large historical newspaper corpus
Language Type:
Multilingual
Languages:
French German Luxembourgish
Availability:
Freely Available
License:
CC BY 4.0
Size:
None
Production Status:
Use:
multiple uses
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso OCR Quality Assessment | /N |
Documentation:
None
Written
Evaluation Data
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
CC BY SA 4.0
Size:
None
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso OCR ground truth | /N |
Documentation:
None
Written
Evaluation Data
Language Type:
Multilingual
Languages:
English French German
Availability:
Freely Available
License:
CC BY SA 4.0
Size:
None
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso HIPE Shared Task Named Entity Gold Standard | /N |
Documentation:
None
Written
Lexicon
Language Type:
Multilingual
Languages:
French German Luxembourgish
Availability:
Freely Available
License:
CC BY 4.0
Size:
None
Production Status:
Newly created-finished
Use:
Language Modelling
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso Word Embeddings | /N |
Documentation:
None
Written
Topic Modelling Data as extracted from historical newspapers
Language Type:
Bilingual
Languages:
French German
Availability:
Freely Available
License:
CC BY 4.0
Size:
None
Production Status:
Newly created-finished
Use:
multiple uses
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso Topic Modelling Data | /N |
Documentation:
None
Written
Text Reuse Data as extracted from historical newspapers
Language Type:
Multilingual
Languages:
French German Luxembourgish
Availability:
Freely Available
License:
CC BY SA 4.0
Size:
None
Production Status:
Newly created-finished
Use:
multiple uses
- Paper title: Language Resources for Historical Newspapers: the Impresso Collection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maud Ehrmann | Impresso Text Reuse Data | /N |
Documentation:
None
Written
Corpus
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
CreativeCommons
Size:
22,183,627 tokens
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: Allgemeine Musikalische Zeitung as a Searchable Online Corpus
- Paper track: Written/poster presentation with demo
- Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bernd Kampe | Allgemeine Musikalische Zeitung | /N |
Documentation:
Code demonstrating some of the steps needed to create the corpus is available at https://github.com/JULIELab/romantik-zeitungen




